For my corpus, I will use two of the playlists that Spotify made for me. The first playlist is “Jouw topnummers van 2020” (“Your Top Songs 2020”) and the second is “Jouw topnummers van 2021” (“Your Top Songs 2021”). What I find interesting about these playlists is that they are, in some way, representative of the music that I listened to in 2020 and 2021. I’m interested in seeing whether specific things have changed in my music taste between those two years.
When comparing two individual tracks, it seemed most logical to me to compare the number one songs from both years. For 2020 that song is Why Why Why Why Why - Sault. For 2021 that song is I Know You, I Live You - Chaka Khan.
Some features that I will look into include valence, energy, tempo, key and mode. I am not sure what my general preference is when it comes to these features, so I am interested to see if there are some things that stand out.
On the side, you can see part of the tables that contain the information about my top tracks from 2020 and 2021. The full table consists of 100 tracks per playlist, with 60 columns containing information about these tracks.
The 2020 playlist will be represented by the color red in some of the graphs, while the 2021 playlist will be represented by the color blue.
Some typical tracks from the 2020 playlist:
Why Why Why Why Why - Sault
Colors - Black Pumas
Exit music (for a film) - Radiohead
Blue World - Mac Miller
H.f.g.w (Canyons Drunken Rage) - Tame Impala
Some atypical tracks from the 2020 playlist:
Fam - sor
Daisy - Ashnikko
Fragments of stasimon of Orestes by Euripides - Petros Tabouris
Some typical tracks from the 2021 playlist:
I know you, I live you - Chaka Khan
You Don’t Listen - General Elektriks
Famous - The Internet
Blackstar - David Bowie
Exit music (for a film) - Radiohead
Some atypical tracks from the 2021 playlist:
Temporary - Lauren Jauregui
SHUM - Go_A
Symphony No.5: IV. Adagietto. Sehr Langsam - Gustav Mahler
| key | loudness | mode | speechiness | acousticness | instrumentalness |
|---|---|---|---|---|---|
| 9 | -6.640 | 0 | 0.0357 | 0.338000 | 6.16e-02 |
| 0 | -5.500 | 1 | 0.0307 | 0.000566 | 1.01e-05 |
| 3 | -8.269 | 0 | 0.0369 | 0.108000 | 1.56e-01 |
| 2 | -7.738 | 1 | 0.1530 | 0.478000 | 2.81e-03 |
| 9 | -8.810 | 1 | 0.0496 | 0.009000 | 8.22e-04 |
| 1 | -8.219 | 1 | 0.0286 | 0.533000 | 2.50e-01 |
| 4 | -7.571 | 0 | 0.0620 | 0.007690 | 0.00e+00 |
| 10 | -8.679 | 1 | 0.3340 | 0.118000 | 0.00e+00 |
| 11 | -5.704 | 1 | 0.0332 | 0.047900 | 5.98e-03 |
| 7 | -8.922 | 0 | 0.0576 | 0.670000 | 4.29e-01 |
| key | loudness | mode | speechiness | acousticness | instrumentalness |
|---|---|---|---|---|---|
| 7 | -9.506 | 0 | 0.0417 | 0.0722 | 1.51e-02 |
| 1 | -7.288 | 1 | 0.0419 | 0.3920 | 4.42e-02 |
| 5 | -9.563 | 0 | 0.0288 | 0.3200 | 9.72e-04 |
| 9 | -5.724 | 1 | 0.0413 | 0.0328 | 1.34e-01 |
| 2 | -6.636 | 1 | 0.0431 | 0.0717 | 7.03e-03 |
| 4 | -7.737 | 1 | 0.0576 | 0.2100 | 0.00e+00 |
| 4 | -6.005 | 0 | 0.1460 | 0.1770 | 1.18e-04 |
| 10 | -12.529 | 1 | 0.0330 | 0.7580 | 2.67e-04 |
| 1 | -9.473 | 1 | 0.0587 | 0.8080 | 1.64e-05 |
| 7 | -3.727 | 1 | 0.0439 | 0.0492 | 5.17e-03 |
Energy summary per playlist:
| playlist | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| 2020 | 0.0440 | 0.4387 | 0.5535 | 0.5581 | 0.6700 | 0.9310 |
| 2021 | 0.00639 | 0.44575 | 0.58200 | 0.57077 | 0.71400 | 0.97600 |
In the histogram, you can see the energy count per playlist. The included summary shows that the medians and the means of the 2020 and 2021 playlists are not far apart. There is, however, some difference in the minimum and maximum energy. In my opinion, this means that in 2021 I listened to a wider variety of music when it comes to energy.
The second chart shows that the 2020 playlist has a wider density. This could be explained by the minimum and maximum being less far apart, which makes the density wider.
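The summary figures used above (minimum, quartiles, median, mean, maximum) and the range comparison can be reproduced with a few lines of code. This is a minimal sketch, not the code behind the portfolio: the function name `six_number_summary` and the energy values are made up for illustration, and the quartile interpolation method is an assumption.

```python
import statistics


def six_number_summary(values):
    """Return min, quartiles, median, mean and max of a list of numbers."""
    # method="inclusive" interpolates between data points and treats the
    # list as the complete population (assumed here, not confirmed by the source).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical energy values, invented for this example only.
energy_a = [0.30, 0.45, 0.55, 0.60, 0.80]
energy_b = [0.10, 0.45, 0.55, 0.60, 0.95]

summary_a = six_number_summary(energy_a)
summary_b = six_number_summary(energy_b)

# The medians agree, but playlist B covers a wider range of energy.
range_a = summary_a["max"] - summary_a["min"]
range_b = summary_b["max"] - summary_b["min"]
print(summary_a["median"], summary_b["median"], range_a, range_b)
```

Two playlists can share a median and still differ clearly in range, which is the pattern described for 2020 and 2021.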
Correlation Energy, Valence 2020
[1] 0.455756
Correlation Energy, Valence 2021
[1] 0.2139411
Energy valence
The first graph shows the relation between energy and valence for both 2020 and 2021. I also calculated the correlation between these two variables. That calculation suggests that energy and valence are more strongly correlated in the 2020 playlist than in the 2021 playlist. This is visually confirmed by the second chart, which, in addition to energy and valence, also shows whether a song is in a minor or a major key. The values seem to be more scattered in the 2021 graph.
Something I found interesting about this graph is that in the 2020 chart, minor songs generally score higher than major songs when it comes to energy. One possible explanation is that the top ten songs with the highest energy are all (hard) rock songs. These songs are often quite high in energy, even when they are in a minor key.
Another thing I found interesting about this graph is that the song with both the lowest energy and the lowest valence in the 2021 chart is actually a song that does not really belong in the playlist. Last year I took a Musicological History course, for which we had to take a listening test to prove that we were able to recognize a piece by hearing it. One of the pieces that I struggled with while studying was Symphony No. 5: IV. Adagietto. Sehr langsam by Gustav Mahler. This is the track with the lowest energy and valence. It is therefore not really a representative song, since I didn’t listen to it because I wanted to, but because I had to.
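To make the correlation figures concrete, here is a small sketch of how a correlation coefficient between energy and valence can be computed. It assumes the Pearson coefficient was used, which the text does not state. The helper `pearson` and all values are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (the shared 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, invented for this example only.
energy = [0.2, 0.4, 0.6, 0.8]
valence_tight = [0.1, 0.3, 0.5, 0.7]  # rises in step with energy
valence_loose = [0.3, 0.1, 0.7, 0.5]  # follows energy only loosely

r_tight = pearson(energy, valence_tight)
r_loose = pearson(energy, valence_loose)
print(r_tight, r_loose)
```

A coefficient closer to 1 corresponds to points that lie closer to a rising straight line, while a lower value corresponds to a more scattered cloud such as the one described for 2021.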
Danceability describes how suitable a track is for dancing, based on a combination of musical elements such as energy, tempo, rhythm stability and beat strength. The 2020 playlist has both a higher median and a higher mean danceability. I find this interesting because the 2021 playlist had the higher mean energy. There was barely any difference in average tempo, with the 2021 playlist scoring about 1.5 BPM higher.
When I look at the ten songs with the highest danceability in each playlist, I don’t see any differences that would logically explain why the 2020 playlist has the higher mean. The ten songs with the lowest danceability make the lower 2021 score easier to understand: two of them come from a classical music playlist that I had to listen to for a school assignment. Besides that, I listened to more rock songs in 2021, and a few of those apparently score really low on danceability.
Danceability summary per playlist:
| playlist | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| 2020 | 0.2930 | 0.5745 | 0.6835 | 0.6737 | 0.7833 | 0.9460 |
| 2021 | 0.0618 | 0.5102 | 0.6410 | 0.6224 | 0.7455 | 0.9490 |
| danceability | track.name |
|---|---|
| 0.293 | Exit Music (For a Film) |
| 0.303 | Feels Like We Only Go Backwards |
| 0.360 | Karma Police |
| 0.399 | Fragment of stasimon of Orestes by Euripides - Ancient manuscript |
| 0.418 | High And Dry |
| 0.430 | Why Don’t You |
| 0.444 | my future |
| 0.447 | Sorry Ain’t Enough |
| 0.452 | Freakin’ Out On the Interstate |
| 0.462 | Red Eyes |
| danceability | track.name |
|---|---|
| 0.0618 | Symphony No. 5: IV. Adagietto. Sehr langsam |
| 0.2010 | Electioneering |
| 0.2760 | Sonata I.X.: I. Foreboding |
| 0.2930 | Exit Music (For A Film) |
| 0.3190 | Blouse |
| 0.3320 | Yesterday - Remastered 2009 |
| 0.3420 | Bodysnatchers |
| 0.3500 | Bury Me |
| 0.3660 | Blackstar |
| 0.3660 | Knights of Cydonia |
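How much the study tracks pull down the 2021 mean can be estimated with simple arithmetic. The sketch below is only an illustration under several assumptions. It takes 100 tracks per playlist (as stated earlier) and uses the rounded means from the danceability summary. It also assumes that the two classical study tracks are the Mahler Adagietto and Sonata I.X. from the 2021 table. `mean_without` is a hypothetical helper name.

```python
def mean_without(total_mean, n_tracks, excluded_values):
    """Mean of the remaining tracks after removing some known values."""
    remaining = n_tracks - len(excluded_values)
    if remaining <= 0:
        raise ValueError("cannot exclude every track")
    # Rebuild the total from the mean, then subtract the excluded values.
    total = total_mean * n_tracks - sum(excluded_values)
    return total / remaining


# Rounded means taken from the danceability summary above.
mean_2020 = 0.6737
mean_2021 = 0.6224

# Assumption: these two table rows are the classical study tracks.
excluded_2021 = [0.0618, 0.2760]

adjusted_2021 = mean_without(mean_2021, 100, excluded_2021)
print(round(adjusted_2021, 4))
```

Under these assumptions the adjusted 2021 mean comes out at roughly 0.63, still below the 2020 mean of 0.6737. The two study tracks would therefore explain only part of the gap, which fits the remark that a few rock songs also score low.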
One observation I made is that the song with the highest tempo can actually be heard in half-time. It is Parachutes by Jordan Mackampa. In my opinion the song is in half-time, which would mean that its tempo is not 193 BPM but roughly 97 BPM.
The tempo is the highest for songs with a 4/4 time signature.
Even though the 2021 playlist has fewer outliers, it has a wider range between the lower and the upper fence. This can be seen in the boxplot.
When I look at the songs that fall above the upper fence for the 2020 playlist, I can see that they have all been assigned to the wrong tempo octave. None of the five highest-tempo songs from that playlist actually has a tempo higher than 158 BPM, the upper fence.
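The fence and tempo-octave reasoning above can be sketched in code. The example assumes the common boxplot convention of fences at 1.5 times the interquartile range beyond the quartiles. The helper names `tukey_fences` and `fold_tempo` are hypothetical and the tempo list is invented, so this is an illustration rather than the procedure used for the plots.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Lower and upper boxplot fences: quartiles minus/plus k times the IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def fold_tempo(bpm, upper_fence):
    """Halve a tempo until it no longer exceeds the upper fence."""
    while bpm > upper_fence:
        bpm /= 2
    return bpm


# Hypothetical tempos in BPM; only the last one is an outlier.
tempos = [90.0, 100.0, 110.0, 115.0, 120.0, 125.0, 130.0, 193.0]

lower, upper = tukey_fences(tempos)
corrected = [fold_tempo(t, upper) for t in tempos]
print(lower, upper, corrected[-1])
```

Halving is a purely mechanical correction: whether a track really is in half-time remains a listening judgment, as with Parachutes above.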
Tempo summary (BPM) per playlist, rounded as printed:
| playlist | minTempo | meanTempo | maxTempo |
|---|---|---|---|
| 2020 | 54.6 | 113 | 194 |
| 2021 | 66.0 | 114 | 180 |
Both tempograms have trouble placing the songs in the right tempo octave. The 2021 tempogram is even more flawed than the 2020 tempogram, since it barely shows any difference between 113 BPM and the other tempos.
| playlist_name | track.name | tempo |
|---|---|---|
| Your Top Songs 2020 | Why Why Why Why Why | 114.762 |
| playlist_name | track.name | tempo |
|---|---|---|
| JTNV2021 | I Know You, I Live You | 113.229 |
The most used key in 2021 is C#-major. The most used key in 2020 is A-minor. The 2021 playlist contains 53 songs in a major key, and 47 in a minor key. The 2020 playlist contains 56 songs in a minor key and 44 in a major key.
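The key and mode counts can be derived from the `key` and `mode` columns shown in the tables earlier. The sketch below assumes that `key` is a pitch-class integer from 0 (C) to 11 (B) and that `mode` is 1 for major and 0 for minor. Spelling the black keys with sharps is an arbitrary choice, and `key_name` and the sample pairs are made up for illustration.

```python
from collections import Counter

# Pitch classes 0-11, spelled with sharps (an arbitrary choice).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key, mode):
    """Turn a pitch-class integer and a mode flag into a label like 'A-minor'."""
    if not 0 <= key <= 11:
        raise ValueError("key must be between 0 and 11")
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]}-{quality}"


# Hypothetical (key, mode) pairs, invented for this example only.
tracks = [(1, 1), (9, 0), (1, 1), (7, 1), (9, 0), (1, 1)]

counts = Counter(key_name(k, m) for k, m in tracks)
most_common_key, occurrences = counts.most_common(1)[0]

n_major = sum(1 for _, m in tracks if m == 1)
n_minor = len(tracks) - n_major
print(most_common_key, occurrences, n_major, n_minor)
```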
This can be linked to the mean valence, energy and tempo of both playlists as well: all three are higher for the 2021 playlist. From this I conclude that the 2021 playlist was a bit ‘happier’ than the 2020 playlist.
Mean Valence 2020
[1] 0.485539
Mean Valence 2021
[1] 0.503828
| loudness | track.name |
|---|---|
| -20.045 | Epitaph of Seikilos |
| -14.903 | I Loved Another Woman |
| -14.602 | 40 Nights |
| -14.002 | Sigurt - Remix |
| -13.418 | Sorry Ain’t Enough |
| -13.013 | Te extraño, pero… |
| -12.724 | Fragment of stasimon of Orestes by Euripides - Ancient manuscript |
| -12.706 | Don’t Let Get You Down - Edit |
| -12.304 | Goodie Bag |
| -11.943 | Territory |
| loudness | track.name |
|---|---|
| -34.643 | Symphony No. 5: IV. Adagietto. Sehr langsam |
| -24.664 | Delicate Felt |
| -20.638 | Blouse |
| -20.428 | Sonata I.X.: I. Foreboding |
| -19.132 | Maria Elena |
| -18.204 | Hold On - Remastered 2010 |
| -16.223 | Hazey - Stripped |
| -15.351 | Why Try to Change Me Now |
| -15.297 | Sunshine |
| -13.924 | Day Dreaming - 2021 Remaster |
The means and medians for both playlists are quite similar. However, the 2021 playlist shows a broader spread between the quartiles and also has more outliers. The outlier at about -35 dB is a classical piece that I had to listen to for school. Several of the outliers are classical music, which would explain why they have a lower loudness value.
Mean Loudness 2020
[1] -8.54666
Mean Loudness 2021
[1] -8.78596
As you can see, the mean for acousticness is higher for the 2021 playlist and the mean for speechiness is higher for the 2020 playlist. The ‘outlier’ that can be seen in the histogram for the 2020 playlist is a rap song called D/Vision by JID. The fact that it is a rap song explains the high level of speechiness. Most of the songs with high acousticness are classical songs, of which some are from a classical playlist that I had to study for a school assignment.
Something that is interesting to me is that the 2021 playlist shows a negative correlation when it comes to acousticness and speechiness.
Correlation 2020, 2021 - Acousticness, Speechiness
[1] 0.02179429
[1] -0.08261927
mean speechiness 2020
[1] 0.09523
mean speechiness 2021
[1] 0.081393
mean acousticness 2020
[1] 0.2526544
mean acousticness 2021
[1] 0.2772244
Pitch
Why Why Why Why Why
During the song, D is the most used pitch. Other common pitches are F and A. This could be explained by the fact that a D-minor chord consists of D-F-A. Another pitch that seems to have a high magnitude is Db. An explanation could be that the song features the 7th tone often, but I think it is more likely that the 7th tone here is C, which is also shown to have a relatively high magnitude for some parts of the song. In my opinion it is more likely that the D was read by the algorithm as a Db.
I Know You, I Live You
The pitches that have the highest magnitude are C, Db, D, G and A. According to the Spotify analysis, the key of this song is Gm. This would explain the high magnitude for the G and D pitches. The dominant key for Gm is Dm, which would explain the high magnitude for the A tone, since the D-minor chord consists of D-F-A.
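The chord reasoning used for both songs can be checked by counting semitones: a minor triad is the root plus the notes 3 and 7 semitones above it, and the dominant lies 7 semitones above the tonic. The following sketch uses flat spellings to match the text. The function names `triad` and `dominant` are illustrative.

```python
# Pitch classes 0-11, spelled with flats to match the text above.
PITCH_CLASSES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def triad(root, quality="minor"):
    """Pitch classes of a major or minor triad built on the given root."""
    if quality not in ("minor", "major"):
        raise ValueError("quality must be 'minor' or 'major'")
    third = 3 if quality == "minor" else 4  # semitones from root to third
    start = PITCH_CLASSES.index(root)
    return [PITCH_CLASSES[(start + step) % 12] for step in (0, third, 7)]


def dominant(root):
    """Pitch class a perfect fifth (7 semitones) above the root."""
    return PITCH_CLASSES[(PITCH_CLASSES.index(root) + 7) % 12]


d_minor = triad("D")
g_minor = triad("G")
fifth_of_g = dominant("G")
print(d_minor, g_minor, fifth_of_g)
```

The D-minor triad comes out as D-F-A and the dominant of G as D, which matches the explanation given for the strong D and A pitches.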
Timbre
For both the 2020 and the 2021 song, the lower coefficients capture more of the timbre information. ‘Why Why Why Why Why’ shows a high magnitude for c01, c02 and c03. ‘I Know You, I Live You’ mainly shows a high magnitude for c02.
The structure of the 2020 top track, ‘Why Why Why Why Why’, can be seen as an A-B-C-B-C-B structure. The instrumental parts of the song are often found at the beginning of a new letter, for example the darker blue blocks at duration 0-16 (the intro) and 59-65. This is a song that features a lot of variation between the different verses and choruses. I think that explains why the matrix is not the cleanest.
The structure of the 2021 top track, ‘I Know You, I Live You’ can be seen as an A-B-A-B-B structure.
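A structure such as A-B-A-B-B becomes visible in a self-similarity matrix because repeated sections have a small distance to each other. The sketch below shows the idea with cosine distance on invented per-section feature vectors. Both the distance measure and the features are assumptions, since the text does not say what the matrices above were built from.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def self_similarity(frames):
    """Matrix of pairwise distances between all frames of one track."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical feature vectors for two kinds of section.
section_a = [1.0, 0.0, 0.2]
section_b = [0.0, 1.0, 0.1]
frames = [section_a, section_b, section_a, section_b, section_b]  # A-B-A-B-B

matrix = self_similarity(frames)
for row in matrix:
    print([round(value, 2) for value in row])
```

Entries near zero away from the main diagonal mark repeated material, so the block pattern of the matrix mirrors the letter structure.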
Some songs that are linked in the 2020 dendrogram include ‘She Can’t Love You’ and ‘Mad At Me’. These are both R&B songs. ‘Back Pocket’ and ‘Lockdown’ are both songs in which the bass line plays an important role. They also have a similar timbre coefficient. The top song for 2020, ‘Why Why Why Why Why’, is clustered with ‘Breeze’, which is the 18th song on that playlist. They have similar values for speechiness, timbre coefficient c02 and liveness.
Some songs that are linked in the 2021 dendrogram include ‘Krunk’ and ‘Little Lady’. These songs being linked surprised me, because they don’t sound very similar to me. When looking at the heat map, I can see that they are actually quite similar if you look at speechiness, loudness and key. These are some features that I don’t necessarily notice consciously when I listen to music, but that are clearly still important for clustering music. Another cluster that surprised me is ‘You Oughta Know’ and ‘Famous’, one being a rock song, while the other is R&B. They are similar when you look at the timbre features, having very similar values for multiple timbre coefficients. They also have similar values for instrumentalness and danceability.
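Two tracks end up linked at the bottom of a dendrogram when their feature vectors are closest together. The sketch below finds that first pair using Euclidean distance on standardized features. The distance measure, the scaling and the three features are assumptions, and the track names and values are invented. The linkage method behind the dendrograms above is not given in the text.

```python
import math
from itertools import combinations


def standardize(rows):
    """Scale each feature column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def closest_pair(names, rows):
    """Names of the two tracks with the smallest Euclidean distance."""
    scaled = dict(zip(names, standardize(rows)))
    return min(
        combinations(names, 2),
        key=lambda pair: math.dist(scaled[pair[0]], scaled[pair[1]]),
    )


# Hypothetical tracks with [speechiness, loudness in dB, key].
names = ["track_a", "track_b", "track_c", "track_d"]
rows = [
    [0.05, -6.0, 2],
    [0.06, -6.2, 2],
    [0.30, -12.0, 9],
    [0.10, -9.0, 5],
]

first_merge = closest_pair(names, rows)
print(first_merge)
```

Standardizing matters because loudness spans tens of decibels while speechiness stays between 0 and 1. Without it, loudness alone would dominate the distances. This also illustrates how two songs that sound different can still be clustered together when a few numeric features happen to match.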
Some of the features that I analysed for this portfolio show high similarity between the 2020 playlist and the 2021 playlist. They gave similar energy histograms, both used 4/4 the most for time signature, and had almost the same tempo mean and median. Other features like mean acousticness and median loudness were almost the same for both playlists.
There were also some features that were less similar. The 2021 playlist showed a broader width for the valence quantiles. The songs were also more scattered around the 2021 valence-energy-mode graph than in the 2020 graph. In other words, the 2020 playlist showed more correlation between energy and valence.
Both of the top songs in these playlists showed a high magnitude for the pitches D and Db, which is interesting to me because D and Db were the second and third most used keys in 2021, while they were used far less in 2020.
And then there were the cases of features that didn’t show a lot of similarity, but that were in my opinion influenced by songs that didn’t belong in my top tracks. For instance, the danceability data was influenced by a couple of tracks that ended up in my top 100 because I had to study them for a school assignment. These same songs also influenced the loudness and speechiness features. There were also some problems with analysing the tempo features, since some of the outliers were in my opinion in the wrong tempo octave.